

Search for: All records

Creators/Authors contains: "Zhou, Siyu"


  1. Guyon, Isabelle (Ed.)
    As the size, complexity, and availability of data continue to grow, scientists are increasingly relying upon black-box learning algorithms that can often provide accurate predictions with minimal a priori model specifications. Tools like random forests have an established track record of off-the-shelf success and even offer various strategies for analyzing the underlying relationships among variables. Here, motivated by recent insights into random forest behavior, we introduce the simple idea of augmented bagging (AugBagg), a procedure that works in an identical fashion to classical bagging and random forests but operates on a larger, augmented space containing additional randomly generated noise features. Surprisingly, we demonstrate that this simple act of including extra noise variables in the model can lead to dramatic improvements in out-of-sample predictive accuracy, sometimes outperforming even an optimally tuned traditional random forest. As a result, intuitive notions of variable importance based on improved model accuracy may be deeply flawed, as even purely random noise can routinely register as statistically significant. Numerous demonstrations on both real and synthetic data are provided along with a proposed solution.
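The augmentation step described in the abstract above can be sketched as follows. This is only an illustration, not the authors' implementation: the function name `augment_with_noise`, the choice of independent standard normal noise, and the toy data are all assumptions, since the abstract says only that the extra features are randomly generated noise.

```python
import numpy as np

def augment_with_noise(X, n_noise, seed=None):
    """Return X with n_noise extra columns of independent N(0, 1) noise appended.

    Assumption: the noise distribution is illustrative; the abstract does not
    specify how the noise features are generated.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    noise = rng.standard_normal((X.shape[0], n_noise))
    # The original features stay in the leading columns, unchanged.
    return np.hstack([X, noise])

# Toy data: 5 observations, 2 original features.
X = np.arange(10, dtype=float).reshape(5, 2)
X_aug = augment_with_noise(X, n_noise=3, seed=0)
print(X_aug.shape)  # (5, 5)
```

The augmented matrix would then be passed to an ordinary bagging or random forest fit in place of the original features; new observations need noise columns appended in the same way before prediction.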
  2. Due to their long‐standing reputation as excellent off‐the‐shelf predictors, random forests (RFs) remain a go‐to model of choice for applied statisticians and data scientists. Despite their widespread use, however, until recently, little was known about their inner workings and about which aspects of the procedure were driving their success. Very recently, two competing hypotheses have emerged: one based on interpolation and the other based on regularization. This work argues in favor of the latter by utilizing the regularization framework to reexamine the decades‐old question of whether individual trees in an ensemble ought to be pruned. Despite the fact that default constructions of RFs use near full depth trees in most popular software packages, here we provide strong evidence that tree depth should be seen as a natural form of regularization across the entire procedure. In particular, our work suggests that RFs with shallow trees are advantageous when the signal‐to‐noise ratio in the data is low. In building up this argument, we also critique the newly popular notion of “double descent” in RFs by drawing parallels to U‐statistics and arguing that the noticeable jumps in random forest accuracy are the result of simple averaging rather than interpolation.
  3. Random forests remain among the most popular off-the-shelf supervised machine learning tools with a well-established track record of predictive accuracy in both regression and classification settings. Despite their empirical success as well as a bevy of recent work investigating their statistical properties, a full and satisfying explanation for their success has yet to be put forth. Here we aim to take a step forward in this direction by demonstrating that the additional randomness injected into individual trees serves as a form of implicit regularization, making random forests an ideal model in low signal-to-noise ratio (SNR) settings. Specifically, from a model-complexity perspective, we show that the mtry parameter in random forests serves much the same purpose as the shrinkage penalty in explicitly regularized regression procedures like lasso and ridge regression. To highlight this point, we design a randomized linear-model-based forward selection procedure intended as an analogue to tree-based random forests and demonstrate its surprisingly strong empirical performance. Numerous demonstrations on both real and synthetic data are provided.
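One way to picture the randomized forward selection idea in the abstract above is the sketch below. It is a guess at the general shape of such a procedure, not the authors' algorithm: the function names, the residual-sum-of-squares selection criterion, the bootstrap resampling, and the coefficient averaging are all assumptions. The only element taken from the abstract is that each step is restricted to a random subset of candidate features, playing the role of `mtry`.

```python
import numpy as np

def randomized_forward_selection(X, y, mtry, max_steps, rng):
    """Greedy forward selection in which each step may only choose among
    `mtry` randomly sampled, not-yet-selected features (hypothetical sketch)."""
    n, p = X.shape
    selected = []
    for _ in range(min(max_steps, p)):
        remaining = [j for j in range(p) if j not in selected]
        k = min(mtry, len(remaining))
        candidates = rng.choice(remaining, size=k, replace=False)
        best_j, best_rss = None, np.inf
        for j in candidates:
            cols = selected + [int(j)]
            A = np.column_stack([np.ones(n), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            if rss < best_rss:
                best_j, best_rss = int(j), rss
        selected.append(best_j)
    # Refit on the final selected set and place coefficients in a full vector.
    A = np.column_stack([np.ones(n), X[:, selected]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    coef = np.zeros(p + 1)  # index 0 holds the intercept
    coef[0] = beta[0]
    coef[np.array(selected, dtype=int) + 1] = beta[1:]
    return coef

def ensemble_coefficients(X, y, mtry, max_steps, n_estimators=100, seed=0):
    """Average the randomized fits over bootstrap samples. For linear models,
    averaging coefficients is equivalent to averaging predictions."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coefs = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of row indices
        coefs.append(randomized_forward_selection(X[idx], y[idx], mtry, max_steps, rng))
    return np.mean(coefs, axis=0)

# Synthetic example: only feature 0 carries signal.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 10))
y = 2.0 * X[:, 0] + data_rng.standard_normal(200)
coef = ensemble_coefficients(X, y, mtry=3, max_steps=3, n_estimators=50)
print(coef.shape)  # (11,)
```

In this sketch a smaller `mtry` means the signal feature is offered less often, so its averaged coefficient is pulled toward zero, which is the shrinkage-like behavior the abstract attributes to `mtry`.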
  4. The synthesis of cone‐shaped Pt nanoparticles featuring compressively‐strained {111} facets by depositing Pt atoms on the vertices of Pd icosahedral nanocrystals, followed by selective removal of the Pd template via wet etching, is reported. By controlling the lateral dimensions down to ca. 3 nm, together with a thickness of ca. 2 nm, the Pt cones show greatly enhanced specific and mass activities toward oxygen reduction, with values being 2.8 and 6.4 times those of commercial Pt/C, respectively. Both the strain field and the observed activity trend are rationalized using density functional theory calculations. With the formation of ultrathin linkers among the Pt cones derived from the same Pd icosahedral seed, the interconnected Pt cones acquire stronger interactions with the carbon support, preventing their detachment and aggregation during the catalytic reaction. Even after 20 000 cycles of accelerated durability testing, the Pt cones still show a mass activity 5.3 times higher than the initial value of the Pt/C.
  5. This article describes a systematic study of the oxidative etching and regrowth behaviors of Pd nanocrystals, including single‐crystal cubes bounded by {100} facets, single‐crystal octahedra and tetrahedra enclosed by {111} facets, and multiple‐twinned icosahedra covered by {111} facets and twin boundaries. During etching, Pd atoms are preferentially oxidized and removed from the corners regardless of the type of nanocrystal, and the resultant Pd2+ ions are then reduced back to elemental Pd. For cubes and icosahedra, the newly formed Pd atoms are deposited on the {100} facets and twin boundaries, respectively, due to their relatively higher energies. For octahedra and tetrahedra, the Pd atoms self‐nucleate in the solution phase, followed by their growth into small particles. We can control the regrowth rate relative to the etching rate by varying the concentration of HCl in the reaction solution. As the concentration of HCl is increased, 18‐nm Pd cubes are transformed into octahedra of 23, 18, and 13 nm, respectively, in edge length. Due to the absence of regrowth, however, Pd octahedra are transformed into truncated octahedra, cuboctahedra, and spheres with decreasing sizes, whereas Pd tetrahedra evolve into truncated tetrahedra and spheres. In contrast, Pd icosahedra with twin boundaries on the surface are converted to asymmetric icosahedra, flower‐like icosahedra, and spheres. This work not only advances the understanding of etching and growth behaviors of metal nanocrystals with various shapes and twin structures but also offers an alternative method for controlling their shape and size.
  6. We report for the first time that Pd nanocrystals can absorb H via a “single‐phase pathway” when particles with a proper combination of shape and size are used. Specifically, when Pd icosahedral nanocrystals of 7 and 12 nm in size are exposed to H atoms, the H‐saturated twin boundaries can divide each particle into 20 smaller single‐crystal units in which the formation of phase boundaries is no longer favored. As such, absorption of H atoms is dominated by the single‐phase pathway and one can readily obtain PdHx with any x in the range of 0–0.7. When switched to Pd octahedral nanocrystals, the single‐phase pathway is only observed for particles of 7 nm in size. We also establish that the H‐absorption kinetics will be accelerated if there is a tensile strain in the nanocrystals due to the increase in lattice spacing. Besides the unique H‐absorption behaviors, the PdHx (x = 0–0.7) icosahedral nanocrystals show remarkable thermal and catalytic stability toward the formic acid oxidation due to the decrease in chemical potential for H atoms in a Pd lattice under tensile strain.